ENSEMBL ftp structure

FTP Site Directories Ref Link

The FTP directory has the following basic structure, although not all information is neccessarily available for each species.

   |-- embl	      Gene predictions annotated on genomic DNA slices of 1 Mb in EMBL format.
   |   | 
   |   |-- species
   |
   |
   |-- emf     	      Alignment dumps in EMF format
   |   | 
   |   |-- pecan                  * Pecan whole genome multiple alignments
   |   |                            with conservation scores for selected sets
   |   |-- ensembl_compara        * protein trees and protein multiple alignments
   |   |                            underlying orthologue/paralogue predictions
   |   |--species_variation       * resequencing data
   |
   | 
   |-- fasta	      Gene predictions in FASTA datatabase format
   |   |
   |   |--species
   |      |
   |      |-- cdna         * Transcript (cDNA) predictions
   |      |-- dna          * Genomic DNA in assembled entities
   |      |-- pep          * Translation (peptide) predictions
   |      |-- rna          * Non-coding RNA predictions
   |
   |            
   |-- genbank	      Gene predictions annotated on genomic DNA slices of 1 Mb in GenBank format.
   |   | 
   |   |-- species
   |
   |
   |-- gtf	      Gene annotation in GTF format
   |
   |-- mysql          MySQL database table text dumps
       |
       |-- core       General genome annotation information
       |
       |                * Genome sequence assembly
       |                * Ensembl gene predictions
       |                * Ab initio gene predictions
       |                * Marker information
       |                * ...
       |
       |-- otherfeatures  Additional genome annotation
       |
       |                * Gene predictions based on EST information
       |                * ...
       |
       |-- variation  Genetic variation information
       |-- vega       Manually curated gene sets
       |-- cdna       cDNA to genome alignments based on the latest EMBL database
       |
       |-- ensembl_compara      Cross-species comparative genomics data:
       |
       |                          * Orthologue/paralogue predictions
       |                          * Protein families
       |                          * Whole genome alignments
       |                          * Synteny information
       |
       |-- ensembl_go           Gene Ontology database
       |
       |-- ensembl_web_user_db  SQL table defintion for server-side user config database
       |
       |-- ensembl_website        Ensembl web site database:
       |                            * Context-sensitive help articles
       |                            * News articles
       |                            * Mini-ads
       |-- compara_mart_homology     Orthologue/paralogue predictions
       |
       |-- compara_mart_multiple_ga  Constrained elements predictions
       |
       |-- compara_mart_pairwise_ga  Whole genome pairwise alignments
       | 
       |-- ensembl_mart              Cross-species data mining tables
       |
       |-- genomic_features_mart     Clone data sets
       |
       |-- sequence_mart             Genome sequences
       |
       |-- snp_mart                  Genetic variation information
       |
       |-- vega_mart                 Manually curated gene sets

Homepage
Comments

Hide Comments